1. INTRODUCTION

The growing use of technology and the rapid development of information technology demand large
bandwidth capacity and mobile services. This leads to bandwidth fluctuation, particularly for services
that use video transmission. Furthermore, wireless channel conditions and bandwidth availability determine
the quality of video transmission. One attractive method for addressing these problems in
video transmission over wireless channels is layered transmission. In this method, the bit stream is
scaled into a number of layers consisting of a base layer and enhancement layers. This is known as the
scalable video method. It is also implemented in multicast networks, where receivers receive only as much
of the video transmission as they need [1].

A systematic review of the vision of scalable video coding (SVC) over broadband is given in [1].
Moreover, the performance of SVC and its applications are shown in [2]-[4]. SVC has three basic
forms: quality scalability (SNR scalability), temporal scalability, and spatial scalability. The
combination of these forms, known as Combined Scalable Video Coding (CSVC), combines the
advantages of each one: the stream can be adapted when bit rate is prioritized, and SNR scalability is used
when quality is preferred. The advantage of CSVC lies in the solution that it provides for video transmission
from various inputs in heterogeneous networks with multiple terminals having different capacities.

Most research has used SVC with JSVM but not the combination of different scalable methods, which
affects the quality of the end-to-end system. Therefore, in this research, we use the combination of SVC
methods (combined scalable video coding), which adapts more flexibly to input characteristics, fluctuations
of the transmission channel, and multicast network conditions. Several researchers have proposed the
use of CSVC, such as hybrid temporal-FGS (fine granular scalability) by Mihaela [5] and spatio-temporal
scalability by Blaszak [6]. In this research, we extend the CSVC method proposed by Schwarz [7]. An
application of SVC to broadband services has been proposed by Maodong Li [8]. There is a limited number of
reported works on the evaluation of video transmission in wireless broadband networks. Thus far, the research
community has used Evalvid [9] as the popular scheme and tool-set for evaluating the performance of video
transmission over networks. The limitation of Evalvid, which supports only the MPEG standard, has encouraged
various parties to develop and refine new schemes, methods, and tools. The myEvalvid scheme proposed in [10]
can run H.264 but cannot yet run SVC.

Several schemes use SVC as a public evaluation platform based on open-source
programs: SVEF (Scalable Video coding Evaluation Framework) [11], EvalSVC (Evaluation
of SVC) [12], and myEvalSVC [13]. The SVEF scheme and tools proposed in [11] are built on an open-source
basis. SVEF is a mixed online/offline open-source framework devised to evaluate the performance of H.264
SVC video streaming. It is written in C and Python and is released under the GNU General
Public License. SVEF can evaluate basic SVC, but it is still incomplete and inadequate for
analyzing complete SVC. In addition, setting up SVEF may not be easy, since it requires specialized skills.
The EvalSVC method and scheme proposed in [12] are publicly available and free. However, EvalSVC does not
clearly indicate how to handle missing or corrupted packets. It was updated and extended in [13] as
myEvalSVC, which can run H.264 AVC and SVC in general. Nevertheless, this scheme has not been
fully explored and developed, so the package cannot yet evaluate CSVC. To
overcome these limitations, we propose a new scheme in this paper. The scheme provides a tool-set to
evaluate and analyze CSVC in a wireless broadband network on an end-to-end system.

This scheme is developed and distributed on an open-source platform so that researchers and
developers can use, analyze, and enrich it according to their requirements and needs [14]. A
comprehensive analysis requires the combination of scalable methods. However, to the best of our
knowledge, there is no research on evaluating SVC using the combination of scalable methods, which is key
to evaluating the quality of video transmission over broadband networks. In this research, we compare the
MGS and CGS modes, which extends our previous work [15], [16].

The framework scheme can evaluate the average Peak Signal-to-Noise Ratio
(PSNR), the coding efficiency, and system metrics such as the end-to-end delay and the queuing delay of the
packets in the network. Furthermore, the scheme is simpler than the schemes used in prior work to
evaluate video transmission over wireless broadband networks (this is explained in more detail in the
next section of this paper). The proposed scheme is a promising approach for evaluating video transmission
over future wireless broadband services. We believe that this new scheme will facilitate and simplify
the evaluation and exploitation of video packets over wireless broadband networks, which is expected to
improve research in this field.

The rest of the paper is organized as follows. Section 2 presents the CSVC evaluation framework,
the CSVC method, the MGS and CGS modes, and the network simulator for wireless broadband networks.
Section 3 describes the system model of the evaluation framework. Section 4 presents the experimental
results and discussion, and Section 5 gives the conclusion and future work.


2. THE COMBINED SCALABLE VIDEO CODING (CSVC) EVALUATION FRAMEWORK 
2.1. Combined scalable video coding (CSVC) 

SVC is a part of H.264/MPEG-4 Part 10 AVC (Advanced Video Coding), or H.264/AVC [1]-[3].
It was preceded by H.262 and MPEG-2, followed by H.263+ and MPEG-4. Since January 2005, MPEG and VCEG have
worked together in the JVT to complete the H.264/AVC amendment as an official standard [2].
The H.264/AVC standard is still being amended, in a cooperative effort of many parties that also aims to
establish Multiview Video Coding and High Efficiency Video Coding (HEVC), or H.265.

Scalability was first proposed to reduce packet (cell) loss in ATM networks [1]. It divides
one bitstream into several sub-bitstreams, or layers, according to the network state, which can offer more
efficient and higher-quality video coding on broadband services than other video coding standards. It creates
two groups of layers: base and enhancement layers. The base layer contains vital information, whereas
the enhancement layers carry residual information that enhances the quality of the video sequence. An SVC
system consisting of an encoder and a decoder is depicted in Figure 1. During data transmission, the base
layer with its vital information still gets through when the transmission channel is congested. Figure 1
shows the three types of scalability: SNR (Signal-to-Noise Ratio) scalability, spatial scalability,
and temporal scalability. In addition, a further scalability method combines these three
methods. When the inter-layer resolution changes, spatial scalability is dominant. When the quality
of the video changes, SNR scalability is dominant. When the bit rate changes, temporal scalability is
dominant. The combination of the three is motivated by varying sequence characteristics, fluctuating
network conditions, and multiple terminals [1], [5], [7]. This research uses three layers of combined
scalability: one base layer and two enhancement layers, as given in Figure 1.
Figure 1. Scalable video coding systems


2.2. MGS and CGS 

In this work, we focus on quality scalability in H.264 SVC. Quality-scalable layers have the
same spatio-temporal resolution but different fidelity. The H.264 SVC extension supports two quality
scalable modes, namely coarse grain scalability (CGS) and medium grain scalability (MGS) [2], [17]. Each
spatial layer is referred to as CGS or MGS. CGS can be viewed as a special case of spatial scalability in
H.264 SVC: it employs similar encoding mechanisms, but the spatial resolution remains
constant. More specifically, like spatial scalability, CGS employs inter-layer prediction mechanisms,
such as prediction of macroblock modes and associated motion parameters and prediction of the residual
signal. CGS differs from spatial scalability in that up-sampling operations are not performed. In CGS, the
residual texture signal in the enhancement layer is re-quantized with a smaller quantization step size than
in the preceding CGS layer. SVC supports up to eight CGS layers, corresponding to eight quality extraction
points: one base layer and up to seven enhancement layers. With CGS, switching between
layers must be done at defined points, whereas with MGS, switching between
layers can be done in any access unit. While CGS provides quality scalability by dropping complete
enhancement layers, MGS provides finer granularity by partitioning a given
enhancement layer into several MGS layers. Individual MGS layers can then be dropped for quality (and bit
rate) adaptation. With the MGS concept, any enhancement layer NAL (Network Abstraction Layer) unit can
be discarded from a quality scalable bit stream.
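
The difference between the two modes can be illustrated with a minimal Python sketch. It is not JSVM code: the `NalUnit` record, its field names, and the helper functions are hypothetical, and the bit stream is reduced to (access unit, quality layer) pairs with made-up sizes. CGS adaptation is modeled as removing complete enhancement layers, whereas MGS adaptation may discard individual enhancement NAL units.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class NalUnit:
    """Hypothetical, simplified NAL unit holding only the fields needed here."""
    access_unit: int   # index of the picture (access unit) it belongs to
    quality_id: int    # 0 = base layer, 1 and above = enhancement layers
    size_bytes: int


def drop_cgs(stream: List[NalUnit], max_quality_id: int) -> List[NalUnit]:
    """CGS-style adaptation: remove complete enhancement layers."""
    return [n for n in stream if n.quality_id <= max_quality_id]


def drop_mgs(stream: List[NalUnit],
             discard: Set[Tuple[int, int]]) -> List[NalUnit]:
    """MGS-style adaptation: discard selected enhancement NAL units,
    identified by (access_unit, quality_id). The base layer is always kept."""
    return [n for n in stream
            if n.quality_id == 0
            or (n.access_unit, n.quality_id) not in discard]


def total_bytes(stream: List[NalUnit]) -> int:
    """Total size of the (possibly adapted) stream."""
    return sum(n.size_bytes for n in stream)


# Toy stream: 4 access units, each with one base and two enhancement layers.
stream = [NalUnit(au, q, size)
          for au in range(4)
          for q, size in ((0, 400), (1, 300), (2, 300))]

cgs = drop_cgs(stream, max_quality_id=1)           # drops layer 2 everywhere
mgs = drop_mgs(stream, discard={(1, 2), (3, 2)})   # drops two single NAL units

print(total_bytes(stream), total_bytes(cgs), total_bytes(mgs))
```

In this toy stream, CGS can only reach the rates obtained by removing whole layers, while MGS reaches an intermediate rate by discarding individual NAL units.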


2.3. Network simulator II (NS-2) for evaluating video transmission

The difficulties of video research arise mainly from the lack of tools and of a testing environment
(test bed), a gap that a network simulator can fill. This paper describes how to
analyze video transmission using a test bed on Network Simulator II (Version 2), widely known as NS-2 [18]. It
is an event-driven tool that is useful for studying the dynamics of communication networks. This test bed
is simple to use in support of research on video transmission [19]. NS-2 is very helpful for researchers
working in fields that are constrained by the availability of laboratory tools or network
experiments. Video transmission over wireless networks requires tools and a complex technical environment. In
this work, we have adopted and modified the network scenarios in NS-2 (WLAN 802.11e) [20] from
C. H. Ke [13].


3. SYSTEM MODEL OF EVALUATION FRAMEWORK 
The proposed evaluation framework scheme is shown in Figure 2. It covers the simulation
of bit rate, PSNR (Peak Signal-to-Noise Ratio), end-to-end delay, and queue length of packets. To analyze
the outputs, we use the MFC YUV viewer software and the VLC media player. The source code of JSVM version 9.18
[21] runs on Linux Fedora 16. The main components of the evaluation framework shown in Figure 2 are
listed in Table 1. External tools from SVEF [11] are adopted in this scheme, as shown in Figure 2.
Figure 2. Novel approach of scheme for evaluating video transmission using CSVC with NS-2


4. RESULTS AND ANALYSIS

In this research, we propose a novel scheme in which CSVC is implemented over a wireless broadband
network, WLAN IEEE 802.11e, on NS-2 [17]. Figure 2 shows the overall research scheme of the evaluation
framework. We use the CGS and MGS modes of JSVM. The experiments use JSVM version
9.18 [21] and NS-2 version 2.29. Computer simulations are carried out to verify the performance of the new
scheme. The parameters and components/tools of the scheme are given in Table 1.


Table 1. Parameters and components/tools in the evaluation framework
Parameters and Components/Tools     Description
Input and output video sequences    Stefan (90 frames); Bus (150 frames); Foreman (300 frames); City (300 frames)
Encoder and Decoder of CSVC         Using JSVM version 9.18 [21]
BitStream Extractor CSVC            Part of JSVM
NS-2                                Version 2.29 [18]
SVEF                                Version 1.4 and external tool [11]
PSNR Analyzer                       External tool and optional
Spatial Scalable                    QCIF; CIF
GOP (Group of Picture)              4 frames
QP (Quantization Parameter)         18; 20; 24; 32
Motion Search Range                 +32
Wireless Standard                   WLAN 802.11e [20]
Data rate                           1 Mbps
OS (Operating System)               Linux Fedora 16


4.1. Video quality parameters 

The video inputs used for testing and analysis are the Stefan, Bus, Foreman, and
City sequences. We use the Peak Signal-to-Noise Ratio (PSNR) to measure the quality of the reconstructed
sequence relative to the original sequence. PSNR is based on the Mean Squared Error (MSE), given by


$$ \mathrm{MSE} = \frac{1}{W H} \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} \left| f(x, y) - g(x, y) \right|^{2}, \tag{1} $$


and PSNR is defined as 


$$ \mathrm{PSNR} = 10 \log_{10} \left( \frac{(2^{n} - 1)^{2}}{\mathrm{MSE}} \right), \tag{2} $$
where W is the number of pixels per row, H is the number of rows per frame, f(x, y) is the luminance intensity
of a pixel in the original frame, g(x, y) is the luminance intensity of the pixel in the reconstructed frame,
and n is the number of bits per pixel. In analyzing PSNR, we use Y-PSNR (the luminance component of the
video), which is commonly used in the analysis of video transmission. The bit rate is calculated as


$$ B_r = \frac{N_b}{N_f} \, M_f, \tag{3} $$


where B_r is the bit rate, N_b is the total number of bits, N_f is the number of frames, and M_f is the mean
frame rate. The input video sequences are analyzed with quantization parameters (QP) of 20, 24, 28, and 32.
The calculation and analysis of the end-to-end delay and the queue system use equations such as the packet
transmission time, given by


$$ T_{packet} = \frac{S_{packet}}{B_{tr}}, \tag{4} $$


where T_packet is the packet transmission time, S_packet is the packet size, and B_tr is the transmission
bandwidth. The analysis of the total delay also includes the propagation delay on the network.
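
As a minimal sketch of how Equations (1) to (4) can be evaluated, the following Python functions compute the MSE, PSNR, bit rate, and packet transmission time. The function names are illustrative and are not part of JSVM, SVEF, or NS-2. The example inputs (a 2 x 2 frame, a frame rate of 30 frames per second, and a 1000-byte packet) are assumptions chosen only to exercise the formulas; only the 1 Mbps data rate comes from Table 1.

```python
import math
from typing import Sequence


def mse(original: Sequence[Sequence[int]],
        reconstructed: Sequence[Sequence[int]]) -> float:
    """Equation (1): mean squared error over an H x W luminance frame."""
    height = len(original)
    width = len(original[0])
    total = 0
    for row_f, row_g in zip(original, reconstructed):
        for f, g in zip(row_f, row_g):
            total += (f - g) ** 2
    return total / (width * height)


def psnr(mse_value: float, bits_per_pixel: int = 8) -> float:
    """Equation (2): PSNR in dB; infinite when the frames are identical."""
    if mse_value == 0:
        return math.inf
    peak_squared = (2 ** bits_per_pixel - 1) ** 2
    return 10.0 * math.log10(peak_squared / mse_value)


def bit_rate(total_bits: int, num_frames: int, frame_rate: float) -> float:
    """Equation (3): B_r = (N_b / N_f) * M_f, in bits per second."""
    return total_bits / num_frames * frame_rate


def packet_tx_time(packet_size_bits: int, bandwidth_bps: float) -> float:
    """Equation (4): T_packet = S_packet / B_tr, in seconds."""
    return packet_size_bits / bandwidth_bps


# Assumed toy inputs, not measurements from the experiments.
original = [[100, 100], [100, 100]]
reconstructed = [[101, 99], [100, 102]]

frame_mse = mse(original, reconstructed)
print(frame_mse, psnr(frame_mse))
print(bit_rate(total_bits=5_100_000, num_frames=90, frame_rate=30.0))
print(packet_tx_time(packet_size_bits=1000 * 8, bandwidth_bps=1_000_000))
```

Averaging the per-frame PSNR values of a sequence gives the average Y-PSNR used in the evaluation.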


4.2. Performance evaluation of framework 
The main parameters used to evaluate framework performance are described as follows: 


4.2.1. Bit rate and coding efficiency

In this research, the coding efficiency of the CSVC encoder is analyzed for the Stefan, Bus,
Foreman, and City sequences, which serve as inputs, as given in Figure 3. We
follow the JSVM algorithm for encoding. The decoding output of CSVC is given in Figure 4. It shows that
on layer 3, MGS mode gains 4 dB of PSNR over CGS mode, that there is no
significant difference on layer 2, and that on layer 1 the difference is 2.4 dB to 3.1 dB. This study uses
two modes, CGS and MGS, each with three layers of CSVC. On layer 3, MGS mode
gains about 2.2 dB of Y-PSNR over CGS mode at a bit rate of 1700 kbps. On the other hand, at a Y-PSNR of
42 dB, MGS mode has a gap of about 500 kbps above CGS mode. On layer 2, there is no significant
difference, whereas on layer 1 the difference is 2.6 dB to 3.2 dB. The gap between CGS and MGS is
significant on layer 3. With the Stefan sequence (Figure 3(a)) and the Bus
sequence (Figure 3(b)) as input, the gap appears from a bit rate of 1000 kbps (1 Mbps) and is
2.2 dB at 3700 kbps. With the Foreman sequence (Figure 3(c)) and the City sequence (Figure 3(d)) as
input, the gap appears from 800 kbps and is around 1.2 dB at about 2700 kbps. In
contrast, on layers 1 and 2 no significant gap between CGS and MGS is seen, with the Stefan
and Bus sequences as input below 1 Mbps and the Foreman and City sequences as input below
800 kbps. We briefly note that MGS outperforms CGS in PSNR at the same bit rate, and that increasing the
number of frames to be transmitted decreases the Y-PSNR gap.
Figure 3. Performance of coding efficiency component of encoding


4.2.2. PSNR as overall framework performance

Figure 4(a) shows the average Y-PSNR output of the new scheme for the Stefan sequence: CGS
reaches 16.8 dB and MGS 20 dB, a gap of around 3.3 dB. For the Bus sequence in Figure 4(b), CGS
reaches 11.9 dB and MGS 14.6 dB, a gap of around 2.8 dB. For the Foreman sequence in Figure 4(c),
CGS reaches 23.3 dB and MGS 25.7 dB, a gap of around 2.45 dB. For the City sequence in Figure 4(d),
CGS reaches 23.5 dB and MGS 24.8 dB, a gap of around 1.4 dB. We can
conclude that MGS is better than CGS by approximately 1.4 dB up to 3.1 dB. However, according to the ITU
reference [22], these results still fall below the fair category (PSNR below
25 dB). The only value above 25 dB is that of MGS mode with the Foreman sequence as input. This shows that
error resilience and error concealment are not yet working well and that many packet/frame errors still
occur. The NS-2 environment affects the results, so its further development is a challenge for future
research.
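
The comparison against the 25 dB boundary can be checked with a short Python sketch that uses only the average Y-PSNR values quoted above. The dictionary layout and variable names are illustrative; the threshold is the boundary stated in the text with reference [22], not a full quality scale.

```python
# Average Y-PSNR values (dB) quoted in the text for Figure 4.
avg_psnr = {
    "Stefan": {"CGS": 16.8, "MGS": 20.0},
    "Bus": {"CGS": 11.9, "MGS": 14.6},
    "Foreman": {"CGS": 23.3, "MGS": 25.7},
    "City": {"CGS": 23.5, "MGS": 24.8},
}

THRESHOLD_DB = 25.0  # boundary quoted in the text with reference [22]

# (sequence, mode) pairs whose average Y-PSNR exceeds the boundary.
above_threshold = [(sequence, mode)
                   for sequence, modes in avg_psnr.items()
                   for mode, value in modes.items()
                   if value > THRESHOLD_DB]

# True when MGS has the higher average for every input sequence.
mgs_always_better = all(modes["MGS"] > modes["CGS"]
                        for modes in avg_psnr.values())

print(above_threshold, mgs_always_better)
```

With these values, only MGS on the Foreman sequence lies above the boundary, and MGS has the higher average for every sequence.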


Int J Elec & Comp Eng, Vol. 8, No. 5, October 2018 : 3407 — 3416 


Int J Elec & Comp Eng, ISSN: 2088-8708


Figure 4. Performance of PSNR metrics of the proposed scheme: (a) Stefan sequence, (b) Bus sequence,
(c) Foreman sequence, (d) City sequence. Each panel plots Y-PSNR (dB) against the number of frames for
CGS and MGS.


4.2.3. End-to-end delay 

The end-to-end delay depends on the characteristics of the input sequence being
transmitted. With the Stefan sequence (Figure 5(a)) and the Bus sequence (Figure 5(b)) as input, CGS is
relatively more stable: the delivery of up to 100 packets is delayed by 0.8 seconds,
while for MGS the end-to-end delay increases linearly. With the Foreman sequence
(Figure 5(c)) and the City sequence (Figure 5(d)) as input, CGS has an average delay of about 0.7 seconds,
while MGS has an average delay of about 0.2 seconds. This indicates that MGS has a lower end-to-end delay
and performs better than CGS. The more packets are transmitted in the network, the larger the delay gap
between CGS and MGS becomes.
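
As a minimal sketch of how per-packet end-to-end delay can be derived, the following Python code subtracts the send time from the receive time of each packet. The dictionary-based trace format, the function name, and the timestamps are hypothetical; real NS-2 trace files would first need to be parsed into this form.

```python
from statistics import mean
from typing import Dict


def end_to_end_delays(sent: Dict[int, float],
                      received: Dict[int, float]) -> Dict[int, float]:
    """Delay per packet id, for packets present in both traces.

    Packets that were sent but never received (lost) are skipped.
    """
    return {packet_id: received[packet_id] - sent[packet_id]
            for packet_id in sent if packet_id in received}


# Hypothetical timestamps in seconds, keyed by packet id.
sent = {0: 0.00, 1: 0.01, 2: 0.02, 3: 0.03}
received = {0: 0.20, 1: 0.22, 3: 0.28}   # packet 2 was lost

delays = end_to_end_delays(sent, received)
average_delay = mean(delays.values())

print(sorted(delays), round(average_delay, 3))
```

Plotting the delay against the packet index gives curves of the kind compared for CGS and MGS in this section.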


4.2.4. Queue length systems 

In this study, we limit the system to a maximum queue of 50 packets in the buffer for reasons of
computational efficiency. Figure 6 shows that for the Stefan sequence (Figure 6(a)) and the Bus
sequence (Figure 6(b)), CGS takes less time than MGS. However, for the Foreman sequence (Figure 6(c))
and the City sequence (Figure 6(d)), which have more frames, the queue length of
MGS is less than that of CGS. Increasing the number of frames transmitted makes the queuing of packets
take longer. In summary, CGS and MGS modes for video transmission over a wireless
broadband network can be implemented well on NS-2, and MGS mode is more satisfactory than
CGS mode.
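
The effect of a finite buffer can be illustrated with a minimal Python sketch of a single-server FIFO queue that drops arriving packets once 50 packets are in the system. This is not the NS-2 queue implementation: the function name, the fixed service time, and the arrival pattern are assumptions made for illustration, and the limit here counts the packet in service as well.

```python
from collections import deque
from typing import Iterable, List, Tuple


def simulate_queue(arrivals: Iterable[float], service_time: float,
                   limit: int = 50) -> Tuple[List[int], int]:
    """Single-server FIFO queue with a finite buffer (drop-tail).

    arrivals: packet arrival times in seconds, in non-decreasing order.
    Returns the queue length observed after each accepted arrival and
    the number of dropped packets.
    """
    departures: deque = deque()   # departure times of packets still queued
    lengths: List[int] = []
    dropped = 0
    last_departure = 0.0
    for t in arrivals:
        # Remove packets that have already departed by time t.
        while departures and departures[0] <= t:
            departures.popleft()
        if len(departures) >= limit:
            dropped += 1          # buffer full: the arriving packet is lost
            continue
        start = max(t, last_departure)
        last_departure = start + service_time
        departures.append(last_departure)
        lengths.append(len(departures))
    return lengths, dropped


# Assumed burst: 60 packets arrive at once, each needing 1 s of service.
burst_lengths, burst_dropped = simulate_queue([0.0] * 60, service_time=1.0)
# Assumed light load: arrivals are spaced wider than the service time.
light_lengths, light_dropped = simulate_queue([0.0, 2.0, 4.0], service_time=1.0)

print(max(burst_lengths), burst_dropped, light_lengths, light_dropped)
```

In the burst case the queue fills to the limit and the excess packets are dropped, while under light load each packet finds the queue empty.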
Figure 5. Performance metrics systems on the end-to-end delay


5. CONCLUSION

In this research, we have proposed a novel approach to an evaluation platform for video transmission
based on CSVC over a wireless broadband network (WLAN IEEE 802.11e), simulated with NS-2.
Furthermore, we have investigated the effects of applying the MGS and CGS modes on
the performance of this system. Applying MGS mode to CSVC increases the performance of the
system in comparison with CGS mode. In general, our scheme is simple and easily implemented for evaluating
video transmission over wireless broadband networks.

Future work will focus on providing CSI (Channel State Information) using a limited feedback
method for a new scheme on adaptive systems for video transmission over wireless broadband network services
such as WiMAX or LTE. Moreover, we will extend and develop our results by implementing the scheme in NS-3
and then comparing the results with real experiments.